Highly Optimized Code for Lattice Quantum Chromodynamics on the CRAY T3E
Abstract
In order to compute physical quantities in lattice quantum chromodynamics, huge systems of linear equations have to be solved. The availability of efficient parallel Krylov subspace solvers plays a vital role in the solution of these systems. We present a detailed analysis of the performance of the stabilized biconjugate gradient (BiCGStab) algorithm with symmetric successive over-relaxed (SSOR) preconditioning on a massively parallel CRAY T3E system.

The numerical investigation of quantum chromodynamics (QCD) on a four-dimensional space-time grid is one of the grand challenges in high-performance scientific computing [1]. QCD is generally considered to be the fundamental theory which describes the strong forces binding quarks with gluons to form the known hadrons like the proton or neutron [2]. Even after 20 years of research, QCD still has not been solved in a non-perturbative analytical approach, and it is by now widely believed that the controlled numerical treatment of the theory on the lattice using very fast parallel supercomputers is the only viable scheme to extract quantitative physical results [3]. The results from lattice gauge theory (LGT) simulations are urgently needed as theoretical input for current and future accelerator experiments that attempt to observe new physics beyond the Standard Model of elementary particle physics [4]. LGT computes functional integrals using Monte Carlo methods known from statistical physics [5]. A representative ensemble of field configurations is generated by a Markov process (simulation phase). These configurations are subsequently analysed by constructing (and averaging over) hadronic correlators from quark Green's functions (analysis phase). The correlators then serve to extract physical observables of interest, such as hadron masses and decay constants.
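The BiCGStab solver with SSOR preconditioning named in the abstract can be illustrated with a minimal serial sketch. This is not the paper's parallel T3E code: it assumes a small dense real matrix stored as a list of lists, whereas the lattice QCD matrix is large, sparse, and complex. The names `ssor_apply`, `bicgstab_ssor`, and `example_system` are hypothetical, and the loop has no safeguards against breakdown (a zero inner product in a denominator).

```python
import math

def matvec(A, x):
    """Dense matrix-vector product."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def ssor_apply(A, r, w=1.0):
    """Return z = M^{-1} r for M = (D + wL) D^{-1} (D + wU) / (w (2 - w))."""
    n = len(r)
    u = [0.0] * n
    for i in range(n):  # forward sweep: (D + wL) u = r
        s = sum(A[i][j] * u[j] for j in range(i))
        u[i] = (r[i] - w * s) / A[i][i]
    z = [0.0] * n
    for i in reversed(range(n)):  # backward sweep: (D + wU) z = D u
        s = sum(A[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (A[i][i] * u[i] - w * s) / A[i][i]
    return [w * (2.0 - w) * zi for zi in z]

def bicgstab_ssor(A, b, w=1.0, tol=1e-10, max_iter=200):
    """Preconditioned BiCGStab; returns (x, iterations used)."""
    n = len(b)
    x = [0.0] * n
    r = list(b)          # residual b - A x for the start vector x = 0
    r_hat = list(r)      # fixed shadow residual
    rho = alpha = omega = 1.0
    v = [0.0] * n
    p = [0.0] * n
    b_norm = math.sqrt(dot(b, b)) or 1.0
    for it in range(1, max_iter + 1):
        rho_new = dot(r_hat, r)
        beta = (rho_new / rho) * (alpha / omega)
        p = [ri + beta * (pi - omega * vi) for ri, pi, vi in zip(r, p, v)]
        y = ssor_apply(A, p, w)
        v = matvec(A, y)
        alpha = rho_new / dot(r_hat, v)
        s = [ri - alpha * vi for ri, vi in zip(r, v)]
        if math.sqrt(dot(s, s)) <= tol * b_norm:
            return [xi + alpha * yi for xi, yi in zip(x, y)], it
        z = ssor_apply(A, s, w)
        t = matvec(A, z)
        omega = dot(t, s) / dot(t, t)   # stabilizing minimal-residual step
        x = [xi + alpha * yi + omega * zi for xi, yi, zi in zip(x, y, z)]
        r = [si - omega * ti for si, ti in zip(s, t)]
        if math.sqrt(dot(r, r)) <= tol * b_norm:
            return x, it
        rho = rho_new
    return x, max_iter

def example_system(n):
    """Nonsymmetric tridiagonal test matrix whose exact solution is all ones."""
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = 4.0
        if i > 0:
            A[i][i - 1] = -2.0
        if i < n - 1:
            A[i][i + 1] = -1.0
    return A, matvec(A, [1.0] * n)
```

Calling `bicgstab_ssor(*example_system(20))` returns an approximate solution vector together with the number of iterations used. In a production setting the dense loops would be replaced by sparse, distributed matrix-vector products and a parallelizable ordering of the SSOR sweeps.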
Related resources
Parallel J-W Monte Carlo Simulations of Thermal Phase Changes in Finite-size Systems
The thermodynamic properties of (TeF6)59 clusters that undergo temperature-driven phase transitions have been calculated with a canonical J-walking Monte Carlo technique. A parallel code for simulations has been developed and optimized on SUN3500 and CRAY-T3E computers. The Lindemann criterion shows that the clusters transform from liquid to solid and then from one solid structure to another in...
Network Simulation on Cray-T3E using MPI
We propose a novel approach for parallel discrete-event network simulation on packet-switched, point-to-point networks. Our algorithm resolves packet conflicts through priority sorting of appropriate integer conflict functions. We implement our method on CM-5, Cray-T3D, and Cray-T3E systems using C and MPI, and perform critical optimizations aimed at reducing sorting overhead, minimizing inter-pr...
Computational aspects of a code to study rotating turbulent convection in spherical shells
The coupling of highly turbulent convection with rotation within a full spherical shell geometry, such as in the solar convection zone, can be studied with the new anelastic spherical harmonic (ASH) code developed to exploit massively parallel architectures. Inter-processor transposes are used to ensure data locality in spectral transforms, a sophisticated load balancing algorithm is implemente...
Scalability of a tera-scale linux-based clusters for parallel ab initio molecular dynamics applications
The KISTI supercomputing center has initiated a TeraCluster project to build a Linux-based cluster with teraflops performance. The main goal of the project is to provide resources composed of PC clusters that meet the level of computing power required by the grand challenge applications in Korea. At the beginning of 2002, we built a prototype of TeraCluster with 128 computing nodes, Phase-I...
Advances in Modeling the Generation of the Geomagnetic Field
DYNAMO is the first simulation code to dynamically and self-consistently model the evolution of the geomagnetic field. In this paper, we present the results of recent efforts to parallelize and optimize this pseudospectral code for massively parallel architectures. On the Cray T3E, these modifications have resulted in a sustained performance of well over 100 Gflops, representing more than a 10...
Publication date: 1997